N=1:

rank | frequency | n-gram |
---|---|---|
1 | 16200 | -а |
2 | 10155 | -е |
3 | 9963 | -и |
4 | 7840 | -т |
5 | 6184 | -о |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 5528 | -те |
2 | 5220 | -та |
3 | 4570 | -от |
4 | 2704 | -ни |
5 | 2588 | -на |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 4636 | -ите |
2 | 4284 | -ата |
3 | 2031 | -иот |
4 | 1306 | -ија |
5 | 1235 | -ски |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 1383 | -ните |
2 | 1253 | -ната |
3 | 1088 | -ката |
4 | 1069 | -ниот |
5 | 756 | -ките |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 592 | -ањето |
2 | 560 | -ијата |
3 | 542 | -ување |
4 | 528 | -скиот |
5 | 521 | -ската |

The tables show the most frequent letter N-grams at the ends of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings; the aim here is suffix detection rather than prefix detection.
For N=3:

    SELECT @pos := (@pos + 1), xx.*
    FROM (SELECT @pos := 0) r,
         (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word, 3)) AS ngram
          FROM words
          WHERE w_id > 100
          GROUP BY RIGHT(word, 3)
          ORDER BY cnt DESC) xx
    LIMIT 5;
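
The same counting can be sketched outside the database for any N. The following is a minimal illustration, not the code used to produce the tables: the function name `top_endings` and the toy word list are hypothetical, and it assumes the input holds each word form once, as the rows of the `words` table presumably do.

```python
from collections import Counter


def top_endings(words, n, limit=5):
    """Return (rank, count, n-gram) rows for the most frequent word endings.

    Hypothetical helper mirroring the SQL query; assumes n >= 1.
    """
    # Like RIGHT(word, n) in MySQL, word[-n:] yields the whole word
    # when the word is shorter than n characters.
    counts = Counter(word[-n:] for word in words)
    # most_common() sorts by descending count; ties keep first-seen order.
    return [
        (rank, cnt, "-" + ngram)
        for rank, (ngram, cnt) in enumerate(counts.most_common(limit), start=1)
    ]


# Hypothetical toy word list, not taken from the corpus.
sample = ["книгата", "жената", "масата", "градот", "човекот", "луѓето"]
for row in top_endings(sample, 2):
    print(row)
```

Calling `top_endings(sample, n)` with n from 1 to 5 would give one small table per N, in the same rank, frequency, n-gram layout as above.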